DEFSET.NEW[COM,LSP] - www.SailDart.org

perm filename DEFSET.NEW[COM,LSP] blob sn#713508 filedate 1983-05-28 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00002 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	Scott:
C00028 ENDMK
C⊗;
Scott:
	I looked at Moon's DEFSETF writeup. The wording was indeed
hairy, and I spent some time trying to untangle it. The result
of my re-wording is included below. I still don't like the ``theory of
SETFing'' section, but could not find much specifically wrong with
it, as I could with other parts. Maybe you look it over a little more
and, given the incentive of my work on it, try to fix that up as well.
			-rpg-

@comment[Insert this in place of the defsetf writeup in control.mss
	 The rest of section 7.2 is unchanged.]

Macros that manipulate generalized variables must guarantee the "obvious"
semantics:  Subforms of generalized-variable references are
evaluated exactly as many times as they appear in the source program, and
they are evaluated in exactly the same order as they appear in the source
program.

In generalized-variable references such as @l[shiftf], @l[incf], @l[push],
and @l[setf] of @l[ldb] the generalized variables are both read and
written in the same reference; preserving the source-program order of
evaluation and the number of evaluations is particularly important.

As an example of these semantic rules, in the generalized-variable reference,
@[setf], the value to be stored must be evaluated @i[after] all the
subforms of the reference since it appears to the right of them.

The expansion of these macros must consist of code that follows these
rules or has the same effect as such code.  This is accomplished by
introducing temporary variables bound to the subforms of the reference.
As an optimization, temporary variables may be eliminated whenever it
can be proven that this has no effect on the semantics of the program.
For example, a constant need never be saved in a temporary variable.
A variable, or any form that does not have side-effects, need not be
saved in a temporary variable if it can be proven that its value will
not change within the scope of the generalized-variable reference.

@clisp provides builtin facilities to take care of
these semantic complications and optimizations.  Since the required
semantics can be guaranteed by these functions, the user does not
have to worry about writing correct code form them, especially in
complex cases. Even experts can become confused and make mistakes
while writing code like this.

Another reason for providing these builtin functions is that the
optimizations that are appropriate will vary from implementation to
implementation.  In some implementations most of the optimization is
performed by the compiler, while in others a simpler compiler is used and
most of the optimization is performed in the macros.  The relative cost of
binding a temporary variable versus the cost of other Lisp operations may
vary widely among implementations, and some implementations may find it
best never to remove temporary variables except in the simplest cases.

A good example of the issues involved can be seen in the following
generalized-variable reference:
@lisp
(incf (ldb byte-field variable))
@endlisp
This ought to expand into
@lisp
(setq variable (dpb (1+ (ldb byte-field variable))
		    byte-field variable))
@endlisp
In this example expansion we have
ignored the further complexity of returning the correct
value, which is the incremented byte, not the new value of @l[variable].
Note that @l[byte-field] is evaluated twice, and @l[variable]
is accessed twice on the "right-hand side" and once on the
"left-hand side."  Consider the expression:
@lisp
(incf (ldb (aref byte-fields (incf i)) (aref words i)))
@endlisp
This ought to expand into:
@lisp
(let ((temp (incf i)))
  (setf (aref words temp)
	(dpb (1+ (ldb (aref byte-fields temp) (aref words temp)))
	     (aref byte-fields temp)
	     (aref words temp))))
@endlisp
Again we have ignored the complexity of returning the correct value.

The @clisp facilities provided to deal with these semantic issues include:
@begin(itemize)
Built-in macros such as @l[setf] and @l[push] that follow the semantic rules.

The @l[define-modify-macro] macro, which allows new generalized-variable
manipulating macros (of a certain restricted kind) to be defined easily.
It takes care of the semantic rules automatically.

The @l[defsetf] macro, which allows new types of generalized-variable references
to be defined easily.  It takes care of the semantic rules automatically.

The @l[define-setf-method] macro and the @l[get-setf-method] function, which
provide access to the internal mechanisms when it is necessary
to define a complicated new type of generalized-variable reference
or generalized-variable-manipulating macro.
@end(itemize)

@Defmac[Fun {define-modify-macro}, Args {@i[name] @i(lambda-list)
@i(function) @Mopt<@i[doc-string]>}] Define a read-modify-write macro
named @i[name].  An example of such a macro is @i[incf].  The first
subform of the macro will be a generalized-variable reference.
@l[function] is the function to apply to the old contents of the
generalized-variable to get the new contents.  @i[lambda-list] describes
the remaining arguments for the @i[function]; it may contain @l[&optional]
and @l[&rest].  @i[doc-string] is documentation for the macro @i[name]
being defined.

The expansion of a @l[define-modify-macro] is equivalent to the following, except
that it generates code that follows the semantic rules outlined above.
@lisp
(defmacro @i[name] (@i[reference] . @i[lambda-list])
  @i[doc-string]
  `(setf ,@i[reference] (@i[function] ,@i[reference] ,@i[arg1] ,@i[arg2]...)))
@endlisp
where @i[arg1], @i[arg2], ..., are the parameters appearing in @i[lambda-list];
appropriate provision is made for an @l[&rest] parameter.

As an example, @l[incf] could have been defined by:
@lisp
(define-modify-macro incf (&optional (delta 1)) +)
@endlisp

An example of a possibly useful macro that is not predefined in @clisp is:
@lisp
(define-modify-macro unionf (other-set &rest keywords) union)
@endlisp
@enddefmac

@Defmac[Fun {defsetf}, Args {@i[access-fn] @Mgroup'@!@i[update-fn]
@Mopt<@i[doc-string]> @Mor @/@i[lambda-list] (@i[store-variable])
@Mstar<@i[declaration] @mor @i[doc-string]> @Mstar<@i[form]>'}] Define how
to @l[setf] a generalized-variable reference where that reference will be
of the form (@i[access-fn] ...).  The value of a generalized-variable
reference can always be obtained simply by evaluating it, so @i[access-fn]
should be the name of a function or a macro.

The user of @l[defsetf] provides a description of how to store into the
generalized-variable reference and return the value that was stored (because
@l[setf] is defined to return this value).  @l[defsetf] takes care of
ensuring that subforms of the reference are only evaluated once and are
evaluated in the proper left-to-right order.  In order to do this,
@l[defsetf] requires that @i[access-fn] be a function or a macro
that evaluates its arguments.
Furthermore a @f[setf] of a call on @i[access-fn] will also evaluate
all of @i[access-fn]'s arguments; it cannot treat any of them specially.
This means that @l[defsetf] cannot be used to describe how to store into
a generalized variable that is a byte, such as @l[(ldb field reference)].
See also @Funref[define-setf-method], which gives the user additional control
at the cost of increased complexity.

A @f[defsetf] may take one of two forms.
The simple form of @f[defsetf] is
@Lisp
(defsetf @i[access-fn] @i[update-fn] @Mopt<@i[doc-string]>)
@Endlisp
The @i[update-fn] must name a function (or macro) that takes one more argument
than @i[access-fn] does.  When @f[setf] is given a @i[place]
that is a call on @i[access-fn], it expands into
a call on @i[update-fn] that is given all the arguments to
@i[access-fn] and also, as its last argument, the new value
(which must be returned by @i[update-fn] as its value).
For example,
@lisp
(defsetf symbol-value set)
@Endlisp
is built into the @clisp system.  This causes the form @l[(setf (symbol-value foo) fu)]
to expand into @l[(set foo fu)].

Note that
@lisp
(defsetf car rplaca)
@endlisp
would be incorrect, because @f[rplaca] does not return its last argument.

The complex form of @f[defsetf] looks like
@lisp
(defsetf @i[access-fn] @i[lambda-list] (@i[store-variable]) @i[body]...)
@endlisp
and resembles @Funref[defmacro].  The @i[body] forms
compute the expansion of a @l[setf] of a call on @i[access-fn].

@i[lambda-list] describes the arguments of @i[access-fn].  @l[&optional]
and @l[&rest] are permitted in @i[lambda-list].  Optional arguments may
have defaults and supplied-p flags.  @i[store-variable] describes the
value to be stored into the generalized-variable reference.
@Rationale{@i[store-variable] is enclosed
in parentheses to provide for a possible extension to multiple store variables,
receiving multiple values from the second subform of @f[setf].}

The @i[body] forms can be written as if the variables in @i[lambda-list]
were bound to subforms of the call on @i[access-fn] and the
@i[store-variable] was bound to the second subform of @l[setf].
However, this is not actually the case.  During the evaluation of the
@i[body] forms, these variables are bound to temporary variables,
generated with @f[gensym] or @f[gentemp], that will be bound by the
expansion of @l[setf] to the values of those subforms.  This permits the
@i[body] forms to be written without regard for order-of-evaluation
issues.  @f[defsetf] arranges for the temporary variables to be
optimized out of the final result in cases where that is possible.  In
other words, the best code possible in a particular implementation will
be generated.

Note that the code generated by the @i[body] forms must include provision
for returning the correct value (the value of @i[store-variable]).  This is
left to the @i[body] forms rather than being handled by @l[defsetf] because
in many cases this value can be returned at no extra cost, by calling a
function that simultaneously stores into the generalized variable and
returns the correct value.

For example,
@lisp
(defsetf subseq (sequence start &optional end) (new-sequence)
  `(progn (replace ,sequence ,new-sequence :start1 ,start :end1 ,end)
	  ,new-sequence))
@endlisp
@Enddefmac

The underlying theory by which @f[setf] and related macros arrange to
conform to the semantic rules given above is that from any
generalized-variable reference one may derive its "@f[setf] method,"
which describes how to store into that reference and which subforms of
it are evaluated.  (The use of the word "method" here has nothing to do
with message-passing or flavors.)  Knowing where the subforms are,
it is possible to avoid evaluating them multiple times or in the wrong
order.  A @f[setf] method for a given form can be expressed as five values:
@begin(itemize)
A list of temporary variables.

A list of values to bind them to (subforms of the given form).
These values must be evaluated in the order they appear in this list.

A second list of temporary variables, called "store variables."

A "storing form."

An "accessing form."
@end(itemize)

The store variables are to be bound to the values of the form to be
stored into the generalized variable.  In almost all cases we store
single values and there is only one store variable.

The storing form and the accessing form can be evaluated inside the
bindings of the temporary variables (and of the store variables, in the
case of the storing form).  The accessing form returns the value of the
generalized variable.  The storing form modifies the value of the
generalized variable and guarantees to return the store variable(s) as
its value(s); these are the correct values for SETF, INCF, etc. to
return.  The value returned by the accessing form is (of course)
affected by execution of the storing form, but otherwise either of these
forms may be evaluated any number of times.

The temporary variables and the store variables are gensyms in all
cases, so that there is never any issue of name clashes among them, or
between them and other variables in the program.  This is necessary to
make the special forms that do multiple @f[setf]s in parallel work
properly; these are @f[psetf], @f[shiftf], and @f[rotatef].  Computation
of the @f[setf] method must always create new gensyms; it may not return
the same ones every time.

Some examples of @f[setf] methods for particular forms:
@begin(itemize)
for a variable @l[x]
@lisp
() () (g0001) (setq x g0001) x
@endlisp

for @l[(car @i[exp])]
@lisp
(g0001) (@i[exp]) (g0002) (rplaca2 g0001 g0002) (car g0001)
@endlisp
where @f[rplaca2] is assumed to be a version of @f[rplaca] that returns
its second argument.
@end(itemize)

@Defmac[Fun {define-setf-method}, Args {@i[access-fn] @i[lambda-list]
@Mstar<@i[declaration] @mor @i[doc-string]> @Mstar<@i[form]>}] Define how
to @l[setf] a generalized-variable reference that is of the form
(@i[access-fn]...).  The value of a generalized-variable reference can
always be obtained simply by evaluating it, so @i[access-fn] should be the
name of a function or a macro.

@i[lambda-list] describes the subforms of the generalized-variable
reference, as in @Funref[defmacro].  The result of evaluating the
@i[forms] in the body must be five values---the @l[setf] method described
above.  Note that @f[define-setf-method] differs from the complex form of
@f[defsetf] in that while the body is being executed the variables in
@i[lambda-list] are bound to parts of the generalized-variable reference,
not to temporary variables that will be bound to the values of such parts.
In addition, @f[define-setf-method] does not have @f[defsetf]'s
restriction that @i[access-fn] must be a function or a function-like
macro; an arbitrary @f[defmacro] destructuring pattern is permitted in
@i[lambda-list].

By definition there are no small examples of @f[define-setf-method], since
the easy cases are all handled by @f[defsetf].  A typical use is to define
the @l[setf] method for @f[ldb]:
@lisp
(define-setf-method ldb (byte word)
  (multiple-value-bind (word-temps word-vals word-stores
			word-store-form word-access-form)
      (get-setf-method word)		;Find out how to access @l[word]
    (let ((btemp (gensym))		;Temporary variable for byte specifier
	  (store (gensym))		;Temporary variable for byte to store
	  (wtemp (first word-stores)))	;Temporary variable for word to store
      (values (cons btemp word-temps)
	      (cons byte word-vals)
	      (list store)
	      `(ldb ,btemp		;To return the right value
		    (let ((,wtemp (dpb ,store ,btemp ,word-access-form)))
		      ,word-store-form))
	      `(ldb ,btemp ,word-access-form)))))
@endlisp
Here we rely on the compiler to optimize out the @l[ldb] in the fourth
value returned (the storing form) if the value is not actually going
to be used.
@enddefmac

@Defun[Fun {get-setf-method}, Args {@i[form]}]
Return five values, the @l[setf] method for @i[form], which must be a
generalized-variable reference.  @f[get-setf-method] takes care of
error-checking and macroexpansion and guarantees to return exactly one
store-variable.
@EndDefun

@Defun[Fun {get-setf-method-multiple-value}, Args {@i[form]}]
Return five values, the @l[setf] method for @i[form], which must be a
generalized-variable reference.  This is the same as @f[get-setf-method]
except that it does not check the number of store-variables; use this
in cases that allow storing multiple values into a generalized variable.
There are no such cases in standard @clisp, but this function is provided
to allow for possible extensions.
@EndDefun

A simplified version of @f[setf], allowing no more and no less than two
subforms, containing no optimization to remove unnecessary variables, and
not allowing storing of multiple values, could be defined by:
@lisp
(defmacro setf (reference value)
  (multiple-value-bind (vars vals stores store-form)
      (get-setf-method reference)
    `(let ,(mapcar #'list (append vars stores) (append vals (list value)))
       ,store-form)))
@endlisp